Capstone Project - The Battle of Neighborhoods

Kyle Bao | 10 June 2020 | IBM Applied Data Science Capstone Project

Introduction

Singapore is a multicultural city-state with a vibrant food scene. Its numerous restaurants serve cuisines from around the world, and this diversity means that visitors may be spoilt for choice when deciding which eateries to visit. A map showing the distribution of high-quality restaurants and eateries in Singapore will be helpful in planning a culinary adventure through the city.

Business Problem

  • How can we provide a map showing the distribution of high-quality restaurants and eateries in Singapore?
  • Can we facilitate the filtering and mapping of different types of eateries?

Data

Instead of using data on the boundaries of neighborhoods in Singapore, we will use location data for the stations of the Mass Rapid Transit (MRT) train network. The stations serve as a proxy for neighborhoods, and the resulting data product is also more useful to a tourist visiting the city. Because the MRT network is cheap and efficient, a visitor can easily plan an itinerary around it and reach different parts of the city. Using the location of each train station, we will search for restaurants within walking distance (a 500m radius) of the station. The result is a map of high-quality restaurants within walking distance of the train stations.

Sources

Location data for the train stations are obtained from the following Wikipedia page:

We will only use data for stations that are currently in operation.

The longitude and latitude coordinates for the stations will be obtained via the Nominatim geocoder for OpenStreetMap data from the geopy Python package.

The Foursquare API will be used to extract the list of restaurants around each station, as well as the restaurants' ratings and cuisine categories.

Finally, the map data used is provided by the folium package.

Data Extraction and Cleaning

The raw list of Singapore MRT stations extracted from the Wikipedia page is messy and contains a lot of irrelevant data, such as subheadings, names of planned stations that do not exist yet, and duplicate entries. I cleaned the data by first removing the subheadings. The data also contains the opening dates of the stations. To drop all stations planned to open after 2020, I extracted the year from each date string using a regular expression, converted it to a numeric value, and applied a filter. Some station names included their Malay translations; these were stripped so that only the English names remain.
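The year-based filter described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library; the row layout, the sample values, and the helper names extract_year and operating_stations are assumptions made for this example and are not taken from the notebook.

```python
import re

# Hypothetical raw rows: (station name, opening date string). The layout and
# the values are illustrative only, not rows from the real Wikipedia table.
RAW_ROWS = [
    ("Station A", "7 November 1987"),
    ("Station B", "2024"),
    ("Station B", "2024"),          # duplicate entry
    ("Some Line Subheading", ""),   # subheading row with no date
    ("Station C", "31 January 2020"),
]

# Matches a four-digit year from 1800 to 2099.
YEAR_PATTERN = re.compile(r"\b(1[89]\d{2}|20\d{2})\b")


def extract_year(date_text):
    """Return the first four-digit year in date_text as an int, or None."""
    match = YEAR_PATTERN.search(date_text)
    return int(match.group(1)) if match else None


def operating_stations(rows, cutoff_year=2020):
    """Keep unique stations whose opening year is known and <= cutoff_year."""
    seen = set()
    kept = []
    for name, date_text in rows:
        year = extract_year(date_text)
        if year is None or year > cutoff_year:
            continue  # subheading, missing date, or station not yet open
        if name in seen:
            continue  # duplicate entry
        seen.add(name)
        kept.append((name, year))
    return kept


print(operating_stations(RAW_ROWS))
```

Treating rows without a parsable year as subheadings is a simplification; real data may need a separate check for each case.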

Next, I used the Nominatim geocoder to obtain the latitude and longitude data for the individual stations.
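A geocoding step of this kind might look like the sketch below. The query suffix, the user agent string, and the helper names geocode_stations and make_nominatim_geocoder are assumptions for illustration; only Nominatim and its geocode method come from geopy itself.

```python
import time


def geocode_stations(names, geocode, suffix=" MRT Station, Singapore", pause=1.0):
    """Map each station name to (latitude, longitude), or None if not found.

    `geocode` is any callable that takes a query string and returns an object
    with .latitude and .longitude attributes (or None when nothing matches),
    which is the shape of the result returned by geopy's Nominatim.geocode.
    The query suffix is an assumption, not the notebook's actual query.
    """
    coords = {}
    for name in names:
        location = geocode(name + suffix)
        coords[name] = (location.latitude, location.longitude) if location else None
        time.sleep(pause)  # the public Nominatim service asks for at most ~1 request/second
    return coords


def make_nominatim_geocoder(user_agent="sg_mrt_capstone"):
    """Build a real geocoder; needs the geopy package and network access.

    The user agent string is a hypothetical placeholder.
    """
    from geopy.geocoders import Nominatim  # imported lazily on purpose
    return Nominatim(user_agent=user_agent).geocode


# Example usage (requires geopy and a network connection):
#   geocode = make_nominatim_geocoder()
#   coords = geocode_stations(["Bugis", "Orchard"], geocode)
```

Passing the geocoder in as a callable keeps the loop easy to test with a stub and makes it simple to swap in a rate-limited or cached geocoder later.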

To obtain the list of restaurants via the Foursquare API, I used the explore endpoint. An additional parameter I included in the query is section=food. Since we are only interested in restaurants and eateries, this parameter ensures that all venues returned by the API are eateries, which lets me make the most of the limited number of API calls. Additional details on the section parameter can be found in the official API documentation. The API also returns the detailed venue category of each venue, as well as its latitude and longitude.
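As a rough sketch, a query against the legacy Foursquare v2 explore endpoint could be built as shown here. The 500m radius matches the rest of this report, but the limit of 100, the version date, and the helper name build_explore_url are assumptions, and the credentials are placeholders.

```python
from urllib.parse import urlencode

# Legacy Foursquare v2 "explore" endpoint.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, client_id, client_secret,
                      radius=500, limit=100, version="20200610"):
    """Build an explore query restricted to food venues around (lat, lng).

    limit and version are assumed values, not confirmed by the notebook.
    """
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,            # API version date in YYYYMMDD form
        "ll": f"{lat},{lng}",    # search centre
        "radius": radius,        # metres
        "limit": limit,          # maximum number of venues to return
        "section": "food",       # only return eateries
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


# Placeholder credentials; substitute real ones before sending the request.
url = build_explore_url(1.3, 103.85, "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET")
print(url)
```

The resulting URL can then be fetched with any HTTP client; one request per station keeps the number of API calls equal to the number of stations.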

Data Exploration

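The number of restaurants and eateries within a 500m radius varies widely between stations, from 3 at Tuas Crescent and Tuas Link to 100 at stations such as Bugis, Chinatown and Orchard. Several stations return exactly 100 venues, which suggests a cap on the number of results per query.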

Station Name  Venue (count)
Admiralty 9
Aljunied 38
Ang Mo Kio 39
Bartley 8
Bayfront 34
Beauty World 77
Bedok 57
Bedok North 12
Bedok Reservoir 4
Bencoolen 53
Bendemeer 22
Bishan 30
Boon Keng 21
Boon Lay 69
Botanic Gardens 32
Braddell 42
Bras Basah 69
Buangkok 21
Bugis 100
Bukit Batok 19
Bukit Gombak 21
Bukit Panjang 54
Buona Vista 45
Caldecott 4
Canberra 4
Cashew 6
Changi Airport 43
Chinatown 100
Chinese Garden 11
Choa Chu Kang 31
City Hall 53
Clarke Quay 100
Clementi 59
Commonwealth 38
Dakota 25
Dhoby Ghaut 57
Dover 16
Downtown 77
Esplanade 100
Eunos 19
Expo 73
Farrer Park 51
Farrer Road 13
Fort Canning 100
Geylang Bahru 8
Gul Circle 6
HarbourFront 100
Haw Par Villa 8
Hillview 25
Holland Village 46
Hougang 32
Jalan Besar 41
Joo Koon 16
Jurong East 64
Kaki Bukit 24
Kallang 16
Kembangan 34
Kent Ridge 30
Khatib 20
King Albert Park 11
Kovan 42
Kranji 5
Labrador Park 21
Lakeside 10
Lavender 70
Little India 34
Lorong Chuan 11
MacPherson 35
Marina Bay 49
Marina South Pier 5
Marsiling 27
Marymount 11
Mattar 27
Mountbatten 54
Newton 20
Nicoll Highway 27
Novena 80
Orchard 100
Outram Park 61
Pasir Panjang 23
Pasir Ris 38
Paya Lebar 73
Pioneer 7
Potong Pasir 44
Promenade 43
Punggol 62
Queenstown 20
Raffles Place 39
Redhill 28
Rochor 35
Sembawang 33
Sengkang 35
Serangoon 36
Simei 22
Sixth Avenue 26
Somerset 42
Stadium 28
Stevens 11
Tai Seng 61
Tampines 77
Tampines East 16
Tampines West 18
Tan Kah Kee 13
Tanah Merah 4
Tanjong Pagar 100
Telok Ayer 84
Telok Blangah 21
Tiong Bahru 30
Toa Payoh 35
Tuas Crescent 3
Tuas Link 3
Tuas West Road 4
Ubi 12
Upper Changi 11
Woodlands 57
Woodlands North 5
Woodlands South 11
Woodleigh 5
Yew Tee 12
Yio Chu Kang 17
Yishun 41
one-north 45

We can also list the 106 distinct categories of restaurants and eateries in the data extracted from Foursquare.

106 F&B categories.
['Chinese Restaurant' 'Bakery' 'Japanese Restaurant'
 'Vegetarian / Vegan Restaurant' 'Indian Restaurant' 'Burger Joint'
 'Seafood Restaurant' 'Korean Restaurant' 'German Restaurant'
 'Hotpot Restaurant' 'Italian Restaurant' 'Ramen Restaurant' 'Salad Place'
 'Café' 'Dumpling Restaurant' 'Sushi Restaurant' 'Halal Restaurant'
 'Asian Restaurant' 'Food Court' 'Cafeteria' 'Sandwich Place'
 'Fast Food Restaurant' 'Steakhouse' 'Dim Sum Restaurant' 'Soup Place'
 'Diner' 'Japanese Curry Restaurant' 'Bistro' 'American Restaurant'
 'Indonesian Restaurant' 'Buffet' 'Malay Restaurant' 'Pizza Place'
 'BBQ Joint' 'Restaurant' 'Portuguese Restaurant' 'Thai Restaurant'
 'Noodle House' 'Food Truck' 'Comfort Food Restaurant' 'Hainan Restaurant'
 'Fried Chicken Joint' 'Snack Place' 'Breakfast Spot' 'Food'
 'Hong Kong Restaurant' 'Wings Joint' 'Mexican Restaurant'
 'Middle Eastern Restaurant' 'Modern European Restaurant'
 'Shaanxi Restaurant' 'Hakka Restaurant' 'Gastropub'
 'Chinese Breakfast Place' 'Bagel Shop' 'Food Stand'
 'Vietnamese Restaurant' 'Cantonese Restaurant' 'Donut Shop'
 'French Restaurant' 'Peking Duck Restaurant' 'Spanish Restaurant'
 'Shabu-Shabu Restaurant' 'Filipino Restaurant' 'Mediterranean Restaurant'
 'Soba Restaurant' 'Swiss Restaurant' 'Deli / Bodega' 'Fish & Chips Shop'
 'Creperie' 'Molecular Gastronomy Restaurant' 'Fujian Restaurant'
 'Burmese Restaurant' 'Argentinian Restaurant' 'Poke Place'
 'Hot Dog Joint' 'Burrito Place' 'Theme Restaurant' 'African Restaurant'
 'Taiwanese Restaurant' 'Pet Café' 'Peruvian Restaurant'
 'Kebab Restaurant' 'Tapas Restaurant' 'Southern / Soul Food Restaurant'
 'Austrian Restaurant' 'Australian Restaurant' 'South Indian Restaurant'
 'Cha Chaan Teng' 'New American Restaurant' 'Beijing Restaurant'
 'Yunnan Restaurant' 'Dongbei Restaurant' 'Greek Restaurant'
 'Szechuan Restaurant' 'Manchu Restaurant' 'Cuban Restaurant'
 'Persian Restaurant' 'North Indian Restaurant' 'Macanese Restaurant'
 'Turkish Restaurant' 'Taco Place' 'Satay Restaurant' 'English Restaurant'
 'Churrascaria' 'Eastern European Restaurant']
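Summaries like the per-station counts and the category list above can be produced with a few lines of code. The sketch below uses the standard library only; the record layout, the sample values, and the helper names are hypothetical and are not taken from the notebook.

```python
from collections import Counter

# Hypothetical venue records: (station name, venue name, venue category).
# The values are illustrative only and are not rows from the real dataset.
SAMPLE_VENUES = [
    ("Station A", "Venue 1", "Bakery"),
    ("Station A", "Venue 2", "Café"),
    ("Station B", "Venue 3", "Bakery"),
]


def venues_per_station(records):
    """Count how many venue records belong to each station."""
    return Counter(station for station, _name, _category in records)


def distinct_categories(records):
    """Return the distinct venue categories in first-seen order."""
    return list(dict.fromkeys(category for _station, _name, category in records))


print(venues_per_station(SAMPLE_VENUES))
print(len(distinct_categories(SAMPLE_VENUES)), "F&B categories.")
```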

Analysing the Eateries and Restaurants Around Each Station

Most Common Restaurant/Eatery

I analysed the list of venues within 500m of each station and grouped them by venue category. This shows which types of restaurants/eateries are most common around each station.

# | Station Name | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue
0 | Admiralty | Bakery | Food Court | Burger Joint | Seafood Restaurant | Noodle House | Snack Place | Fish & Chips Shop | Diner | Dongbei Restaurant | Donut Shop
1 | Aljunied | Noodle House | Chinese Restaurant | Vegetarian / Vegan Restaurant | Asian Restaurant | Food Court | Café | Breakfast Spot | Dim Sum Restaurant | Seafood Restaurant | Indian Restaurant
2 | Ang Mo Kio | Food Court | Fast Food Restaurant | Japanese Restaurant | Sushi Restaurant | Chinese Restaurant | Asian Restaurant | Sandwich Place | Snack Place | Breakfast Spot | Burger Joint
3 | Bartley | Noodle House | Indian Restaurant | Asian Restaurant | Seafood Restaurant | Korean Restaurant | Food Truck | Café | Yunnan Restaurant | Fish & Chips Shop | Dongbei Restaurant
4 | Bayfront | Café | Bistro | Japanese Restaurant | Italian Restaurant | Chinese Restaurant | Noodle House | Asian Restaurant | Southern / Soul Food Restaurant | Dumpling Restaurant | Cantonese Restaurant
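One simple way to build a ranking like the table above is to count categories per station and keep the most frequent ones. This is a sketch under assumptions: the (station, category) record layout and the helper name top_categories are hypothetical, and the notebook may have ranked categories differently (for example by relative frequency).

```python
from collections import Counter, defaultdict


def top_categories(records, n=10):
    """Return each station's n most frequent venue categories, most common first.

    `records` is an iterable of (station name, venue category) pairs.
    Ties keep the order in which the categories were first seen.
    """
    per_station = defaultdict(Counter)
    for station, category in records:
        per_station[station][category] += 1
    return {
        station: [category for category, _count in counts.most_common(n)]
        for station, counts in per_station.items()
    }


# Illustrative records only, not rows from the real dataset.
sample = [
    ("Station A", "Bakery"),
    ("Station A", "Bakery"),
    ("Station A", "Café"),
    ("Station B", "Diner"),
]
print(top_categories(sample, n=2))
```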

K Means Clustering

Using k-means clustering, I grouped the stations into 10 clusters based on the similarity of each station's top 10 restaurant/eatery categories. The following map shows the clusters, with each cluster drawn in a different colour.

[Interactive map: MRT stations coloured by cluster]
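For readers unfamiliar with the technique, the clustering step can be illustrated with a plain implementation of Lloyd's algorithm for k-means. This is a teaching sketch in pure Python, not the code used in the notebook, which does not show its implementation here. It assumes each station has already been turned into a numeric vector, for example the share of its venues in each category.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def k_means(vectors, k, iterations=100, seed=0):
    """Cluster vectors with Lloyd's algorithm; return one label per vector."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = mean_vector(members)
    return labels


# Two obvious groups of illustrative 2-D points.
points = [[0, 0], [0, 1], [10, 10], [10, 11]]
print(k_means(points, k=2))
```

With real data, k would be 10 and each vector would have one component per venue category. The cluster labels depend on the random starting points, so only the grouping is meaningful, not the label numbers.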

Mapping Out all Restaurants of a Certain Category

One of the goals of this project is to provide entrepreneurs with an easy-to-use map visualisation of all the restaurants in the city. From our earlier data extraction, we already have a comprehensive dataset of the restaurants and eateries within a 500m radius of every MRT station in Singapore.

An entrepreneur who wishes to open a restaurant serving a certain type of cuisine may want to know the concentration of similar restaurants/eateries in the city. Hence, I wrote a function that maps out all restaurants of a given category when called with the right parameters.

The map also displays the MRT stations as red circles, and the eateries as either blue circles in the non-clustered map, or as popup markers in the clustered version.
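A mapping function along these lines might look like the following sketch. The record layout (dictionaries with name, category, lat and lng keys), the function names, the marker sizes, and the approximate map centre are all assumptions for illustration; the folium calls used (Map, CircleMarker, Marker and the MarkerCluster plugin) are part of the folium package.

```python
def filter_by_category(venues, category):
    """Return the venue records whose category matches exactly."""
    return [v for v in venues if v["category"] == category]


def map_category(venues, stations, category, clustered=False,
                 centre=(1.3521, 103.8198), zoom=12):
    """Draw stations in red and matching eateries in blue or as clustered markers.

    Requires the folium package, imported lazily so that the filter above can
    be used without it. `centre` is an approximate centre point for Singapore.
    """
    import folium
    from folium.plugins import MarkerCluster

    fmap = folium.Map(location=list(centre), zoom_start=zoom)

    # MRT stations as red circles.
    for s in stations:
        folium.CircleMarker([s["lat"], s["lng"]], radius=4, color="red",
                            fill=True, popup=s["name"]).add_to(fmap)

    matches = filter_by_category(venues, category)
    if clustered:
        # Popup markers grouped into clusters that expand on zoom.
        cluster = MarkerCluster().add_to(fmap)
        for v in matches:
            folium.Marker([v["lat"], v["lng"]], popup=v["name"]).add_to(cluster)
    else:
        # Individual eateries as blue circles.
        for v in matches:
            folium.CircleMarker([v["lat"], v["lng"]], radius=3, color="blue",
                                fill=True, popup=v["name"]).add_to(fmap)
    return fmap


# Example usage in a notebook (requires folium):
#   map_category(venues, stations, "Indian Restaurant", clustered=True)
```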

Let us see it in action.

Indian Restaurants

Map with No Clustering

[Interactive map: Indian restaurants near MRT stations, no clustering]

Map with Clustering

[Interactive map: Indian restaurants near MRT stations, with clustering]

Interestingly, there seem to be very few Indian restaurants near the MRT stations in the heartland estates in the North, the Northeast, and all the way to the West. Unsurprisingly, there is a large concentration of Indian restaurants around the Little India MRT station.

Mexican Restaurants

Map with No Clustering

[Interactive map: Mexican restaurants near MRT stations, no clustering]

There are very few Mexican restaurants within a 500m radius of an MRT station. In fact, as seen from the map, our dataset contains only 28 such restaurants. If demand can be ascertained, opening a Mexican restaurant could be a venture worth considering.

Conclusions

There are numerous restaurants and eateries in Singapore serving all kinds of cuisines. However, as this analysis and visualisation have shown, the distribution of such eateries is not even. Furthermore, some cuisines have fewer restaurants, which provides an opportunity for new businesses. I hope that the tool in the accompanying notebook will help people make this business decision.

Further extensions include using a Premium Foursquare API to extract the ratings for all the restaurants in our dataset. This was not attempted in this exercise due to the prohibitive pricing.